random matrix theory
SPQR: Controlling Q-ensemble Independence with Spiked Random Model for Reinforcement Learning
To overcome overestimation bias, ensemble methods for Q-learning that exploit the diversity of multiple Q-functions have been investigated. Network initialization has been the predominant approach to promoting diversity among Q-functions, and heuristically designed diversity injection methods have also been studied in the literature. However, previous studies have not pursued theoretically guaranteed independence across an ensemble.
- North America > United States > California > Alameda County > Berkeley (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > Pennsylvania (0.04)
- North America > Canada (0.04)
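A minimal sketch may help ground the ensemble idea in this abstract. The tabular setting, dimensions, and min-over-members target below are illustrative assumptions, not SPQR's method; they show the standard way a Q-ensemble of independently initialized members curbs overestimation:

```python
# Minimal tabular sketch (not SPQR itself): N independently initialized
# Q-tables, with a shared pessimistic bootstrap target taken as the minimum
# over ensemble members -- the usual way Q-ensembles curb overestimation.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, n_ensemble = 5, 3, 4   # illustrative sizes
gamma, lr = 0.99, 0.1

# Independent random initialization: the predominant diversity mechanism.
Q = rng.normal(size=(n_ensemble, n_states, n_actions))

def ensemble_update(s, a, r, s_next):
    """One Q-learning step: every member regresses toward
    r + gamma * min_i max_a' Q_i(s', a')."""
    target = r + gamma * np.min(np.max(Q[:, s_next, :], axis=1))
    Q[:, s, a] += lr * (target - Q[:, s, a])

ensemble_update(s=0, a=1, r=1.0, s_next=2)
print(Q[:, 0, 1])  # all members moved toward the shared pessimistic target
```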
Evaluating Singular Value Thresholds for DNN Weight Matrices based on Random Matrix Theory
Nishikawa, Kohei, Shimizu, Koki, Hashiguchi, Hiroki
This study evaluates thresholds for removing singular values from singular value decomposition-based low-rank approximations of deep neural network weight matrices. Each weight matrix is modeled as the sum of signal and noise matrices. The low-rank approximation is obtained by removing noise-related singular values using a threshold based on random matrix theory. To assess the adequacy of this threshold, we propose an evaluation metric based on the cosine similarity between the singular vectors of the signal and original weight matrices. The proposed metric is used in numerical experiments to compare two threshold estimation methods.
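A short sketch can make the described procedure concrete. The threshold used below is the Marchenko-Pastur bulk edge $\sigma(\sqrt{m}+\sqrt{n})$ for i.i.d. Gaussian noise, one standard RMT choice; the paper's two threshold estimation methods are not specified in this abstract, so the noise model and all dimensions here are assumptions:

```python
# Hedged sketch of the signal-plus-noise setup described in the abstract.
import numpy as np

rng = np.random.default_rng(1)
m, n, rank, sigma = 200, 100, 5, 1.0

# W = signal (low rank) + noise: the model assumed for each weight matrix.
signal = rng.normal(size=(m, rank)) @ rng.normal(size=(rank, n)) * 0.5
W = signal + sigma * rng.normal(size=(m, n))

U, s, Vt = np.linalg.svd(W, full_matrices=False)
edge = sigma * (np.sqrt(m) + np.sqrt(n))  # approx. largest noise singular value
keep = s > edge                           # drop singular values inside the bulk
W_lowrank = (U[:, keep] * s[keep]) @ Vt[keep]

# Evaluation metric in the spirit of the abstract: cosine similarity between
# leading singular vectors of the signal matrix and of the original matrix W.
U_sig = np.linalg.svd(signal, full_matrices=False)[0]
cos = np.abs(np.sum(U_sig[:, :rank] * U[:, :rank], axis=0))
print("retained singular values:", keep.sum(),
      "cosine similarities:", np.round(cos, 3))
```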
Nonlinear random matrix theory for deep learning
Neural network configurations with random weights play an important role in the analysis of deep learning. They define the initial loss landscape and are closely related to kernel and random feature methods. Despite the fact that these networks are built out of random matrices, the vast and powerful machinery of random matrix theory has so far found limited success in studying them. A main obstacle in this direction is that neural networks are nonlinear, which prevents the straightforward utilization of many of the existing mathematical results. In this work, we open the door for direct applications of random matrix theory to deep learning by demonstrating that the pointwise nonlinearities typically applied in neural networks can be incorporated into a standard method of proof in random matrix theory known as the moments method. The test case for our study is the Gram matrix $Y^TY$, $Y=f(WX)$, where $W$ is a random weight matrix, $X$ is a random data matrix, and $f$ is a pointwise nonlinear activation function. We derive an explicit representation for the trace of the resolvent of this matrix, which defines its limiting spectral distribution. We apply these results to the computation of the asymptotic performance of single-layer random feature methods on a memorization task and to the analysis of the eigenvalues of the data covariance matrix as it propagates through a neural network. As a byproduct of our analysis, we identify an intriguing new class of activation functions with favorable properties.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
- Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)
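To make the object of study concrete, the sketch below draws $W$ and $X$ with i.i.d. Gaussian entries, applies a pointwise nonlinearity (here $f=\tanh$, an illustrative choice), and computes the empirical spectrum of the resulting Gram matrix; the normalizations are one common convention and not necessarily the paper's:

```python
import numpy as np

rng = np.random.default_rng(2)
n0, n1, d = 1000, 1000, 1000                  # input dim, width, number of inputs
W = rng.normal(size=(n1, n0)) / np.sqrt(n0)   # random weights, W_ij ~ N(0, 1/n0)
X = rng.normal(size=(n0, d))                  # random data matrix
Y = np.tanh(W @ X)                            # pointwise nonlinearity f, entrywise

gram = Y.T @ Y / n1         # Gram matrix whose limiting spectrum the paper derives
eigs = np.linalg.eigvalsh(gram)
print("eigenvalue range: [%.3f, %.3f]" % (eigs.min(), eigs.max()))
```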
Generalization in Representation Models via Random Matrix Theory: Application to Recurrent Networks
Moakher, Yessin, Tiomoko, Malik, Louart, Cosme, Liao, Zhenyu
We first study the generalization error of models that use a fixed feature representation (frozen intermediate layers) followed by a trainable readout layer. This setting encompasses a range of architectures, from deep random-feature models to echo-state networks (ESNs) with recurrent dynamics. Working in the high-dimensional regime, we apply Random Matrix Theory to derive a closed-form expression for the asymptotic generalization error. We then apply this analysis to recurrent representations and obtain concise formulas that characterize their performance. Surprisingly, we show that a linear ESN is equivalent to ridge regression with an exponentially time-weighted ("memory") input covariance, revealing a clear inductive bias toward recent inputs. Experiments match the predictions: ESNs win in low-sample, short-memory regimes, while ridge regression prevails with more data or long-range dependencies. Our methodology provides a general framework for analyzing overparameterized models and offers insights into the behavior of deep learning networks.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
- Europe > France > Île-de-France > Paris > Paris (0.04)
- North America > United States (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > France > Île-de-France > Paris > Paris (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
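The stated ESN-ridge equivalence rests on the fact that a linear reservoir state is an exponentially weighted sum of past inputs, so its covariance is an exponentially time-weighted input covariance. The sketch below checks this identity numerically for a linear reservoir with scalar decay $a$, a simplification (of a general recurrent weight matrix) chosen for clarity:

```python
import numpy as np

rng = np.random.default_rng(3)
d, n_res, T, a = 10, 50, 5000, 0.8            # input dim, reservoir size, length, decay
W_in = rng.normal(size=(n_res, d)) / np.sqrt(d)

X = rng.normal(size=(T, d))                   # i.i.d. standard-normal input sequence
H = np.zeros((T, n_res))
h = np.zeros(n_res)
for t in range(T):                            # linear reservoir: h_t = a*h_{t-1} + W_in x_t
    h = a * h + W_in @ X[t]
    H[t] = h

# h_t = sum_k a^k W_in x_{t-k}, so with Cov(x) = I the stationary state
# covariance is sum_k a^{2k} W_in W_in^T = W_in W_in^T / (1 - a^2).
C_emp = H.T @ H / T
C_pred = W_in @ W_in.T / (1 - a**2)
print("relative error:",
      np.linalg.norm(C_emp - C_pred) / np.linalg.norm(C_pred))  # small, up to sampling noise
```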